Diabetic Retinopathy (DR) is a leading cause of vision loss worldwide, and early DR detection is necessary to prevent vision loss and support appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System, which classifies DR grades, localizes lesion areas, and provides visual explanations; (ii) DRG-Expert-Interaction, which receives feedback from expert users and improves the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. In addition, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss-function constraints between lesion features and classification features, our approach is robust to a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly-supervised manner.
Collecting large-scale medical datasets with fully annotated samples for training deep networks is prohibitively expensive, especially for 3D volume data. Recent breakthroughs in self-supervised learning (SSL) offer the ability to overcome the lack of labeled training samples by learning feature representations from unlabeled data. However, most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes. In practice, this restricts the capability to fully leverage unlabeled data from numerous sources, which may include both 2D and 3D data. Additionally, the use of these pre-trained networks is constrained to downstream tasks with compatible data dimensions. In this paper, we propose a novel framework for unsupervised joint learning on 2D and 3D data modalities. Given a set of 2D images or 2D slices extracted from 3D volumes, we construct an SSL task based on a 2D contrastive clustering problem for distinct classes. The 3D volumes are exploited by computing a vector embedding for each slice and then assembling a holistic feature through deformable self-attention mechanisms in a Transformer, which allows long-range dependencies between slices inside 3D volumes to be incorporated. These holistic features are further utilized to define a novel 3D clustering agreement-based SSL task and a masked embedding prediction task inspired by pre-trained language models. Experiments on downstream tasks, such as 3D brain segmentation, lung nodule detection, 3D heart structures segmentation, and abnormal chest X-ray detection, demonstrate the effectiveness of our joint 2D and 3D SSL approach. We improve plain 2D Deep-ClusterV2 and SwAV by a significant margin and also surpass various modern 2D and 3D SSL approaches.
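We consider the problem of joint channel assignment and power allocation in underlaid cellular vehicle-to-everything (C-V2X) systems, where multiple vehicle-to-network (V2N) uplinks share time-frequency resources with multiple vehicle-to-vehicle (V2V) platoons that enable groups of connected and autonomous vehicles to travel closely together. Because of the high user mobility inherent in vehicular environments, traditional centralized optimization approaches that rely on global channel information may not be viable in C-V2X systems with a large number of users. Using a multi-agent reinforcement learning (RL) approach, we propose a distributed resource allocation (RA) algorithm to overcome this challenge. Specifically, we model the RA problem as a multi-agent system. Based solely on local channel information, each platoon leader acts as an agent, interacts collectively with the others, and accordingly selects the optimal combination of sub-band and power level to transmit its signals. To this end, we use the double Q-learning algorithm to jointly train the agents under the objectives of simultaneously maximizing the sum-rate of the V2N links and satisfying the packet delivery probability of each V2V link within a desired latency limit. Simulation results show that our proposed RL-based algorithm provides performance close to that of the well-known exhaustive search algorithm.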
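The double Q-learning update named in the abstract above can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the function name `double_q_update`, the dictionary-based tables, the encoding of each (sub-band, power level) pair as an integer action index, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
import random


def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha=0.1, gamma=0.9, rng=random):
    """Apply one tabular double Q-learning step and report which table changed.

    q_a and q_b map each state to a list of action values. An action index is
    assumed (hypothetically) to encode one (sub-band, power level) pair.
    """
    # Pick at random which table is updated; the other one evaluates.
    if rng.random() < 0.5:
        update, evaluate = q_a, q_b
    else:
        update, evaluate = q_b, q_a

    # The updated table selects the greedy next action ...
    next_values = update[next_state]
    best_next = max(range(len(next_values)), key=next_values.__getitem__)

    # ... and the other table supplies its value, which reduces the
    # overestimation bias of plain Q-learning.
    target = reward + gamma * evaluate[next_state][best_next]
    update[state][action] += alpha * (target - update[state][action])
    return update is q_a


# Hypothetical toy problem: 2 local channel states, 4 (sub-band, power) actions.
q_a = {s: [0.0] * 4 for s in range(2)}
q_b = {s: [0.0] * 4 for s in range(2)}
double_q_update(q_a, q_b, state=0, action=1, reward=1.0, next_state=1)
```

In a multi-agent setting such as the one described, each platoon leader would hold its own pair of tables (or networks) and call an update like this using only its local observations.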
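This study presents our approach to automatic Vietnamese image captioning in the healthcare domain for the text processing tasks of the Vietnamese Language and Speech Processing (VLSP) Challenge 2021. Image captioning models commonly use one architecture as the encoder and a long short-term memory (LSTM) network as the decoder to generate sentences. These models perform well on different datasets. Our proposed model also has an encoder and a decoder, but we use a Swin Transformer in the encoder and an LSTM combined with an attention module in the decoder. The study presents the training experiments and techniques we used during the competition. Our model achieves a BLEU4 score of 0.293 on the vietcap4h dataset, which ranks 3rd on the private leaderboard. Our code is available at https://git.io/jddjm.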
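Cardiovascular diseases (CVDs) are a group of heart and blood vessel disorders that pose one of the most serious dangers to human health, and the number of such patients is still growing. Early and accurate detection plays a key role in successful treatment and intervention. The electrocardiogram (ECG) is the gold standard for identifying a variety of cardiovascular abnormalities. In clinical practice and in most current research, the standard 12-lead ECG is mainly used. However, using fewer leads can make ECG more widely available, since it can be conveniently recorded by portable or wearable devices. In this study, we develop a novel deep learning system that accurately identifies multiple cardiovascular abnormalities using only three ECG leads.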
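Algorithmic recourse aims to recommend informative feedback that overturns an unfavorable machine learning decision. In this paper we introduce the Bayesian recourse, a model-agnostic recourse that minimizes the posterior probability odds ratio. Furthermore, we present its min-max robust counterpart, whose goal is to hedge against future changes in the machine learning model parameters. The robust counterpart explicitly accounts for perturbations of the data in a Gaussian mixture prescribed using the optimal transport (Wasserstein) distance. We show that the resulting worst-case objective function can be decomposed into solving a series of two-dimensional optimization subproblems, so the min-max recourse finding problem is amenable to a gradient descent algorithm. In contrast to existing methods for generating robust recourses, the robust Bayesian recourse does not require a linear approximation step. Numerical experiments demonstrate the effectiveness of our proposed robust Bayesian recourse under model shifts. Our code is available at https://github.com/vinairesearch/robust-bayesian-recourse.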
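Multi-camera multi-object tracking is currently drawing attention in the computer vision field because of its superior performance in real-world applications such as video surveillance of crowded scenes or vast spaces. In this work, we propose a mathematically elegant multi-camera multi-object tracking approach based on a spatial-temporal lifted multicut formulation. Our model uses state-of-the-art tracklets produced by single-camera trackers as proposals. Because these tracklets may contain ID-switch errors, we refine them through a novel pre-clustering obtained from 3D geometry projections. As a result, we derive a better tracking graph without ID switches and more precise affinity costs for the data association phase. Tracklets are then matched to multi-camera trajectories by solving a global lifted multicut formulation that incorporates short- and long-range temporal interactions on tracklets located in the same camera as well as across cameras. Experiments on the WildTrack dataset yield near-perfect results, and our method outperforms state-of-the-art trackers on Campus while being on par with them on the PETS-09 dataset. We will make our implementation available upon acceptance of the paper.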
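The non-independent and identically distributed (non-IID) data distribution among clients is considered the key factor that degrades the performance of federated learning (FL). Several approaches to handling non-IID data, such as personalized FL and federated multi-task learning (FMTL), are of great interest to the research community. In this work, we first formulate the FMTL problem using Laplacian regularization to explicitly leverage the relationships among the clients' models for multi-task learning. We then introduce a new view of the FMTL problem, which shows for the first time that the formulated FMTL problem can be used for both conventional FL and personalized FL. We also propose two algorithms, FedU and dFedU, to solve the formulated FMTL problem in communication-centralized and decentralized schemes, respectively. Theoretically, we prove that the convergence rates of both algorithms achieve linear speedup for strongly convex objectives and sublinear speedup for nonconvex objectives. Experimentally, we show that our algorithms outperform the conventional algorithm FedAvg in FL settings, MOCHA in FMTL settings, and pFedMe and Per-FedAvg in personalized FL settings.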
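The idea of Laplacian regularization mentioned in the abstract above can be sketched as a penalty that grows when related clients' models drift apart. The abstract does not give the exact objective, so the form below (half the adjacency-weighted sum of squared distances between client weight vectors) is one common choice assumed for illustration; the names `laplacian_penalty`, `weights`, and `adjacency` are hypothetical.

```python
def laplacian_penalty(weights, adjacency):
    """Return 0.5 * sum over (k, l) of a_kl * ||w_k - w_l||^2.

    weights:   list of per-client weight vectors (lists of floats).
    adjacency: symmetric matrix (list of lists) of non-negative relationship
               strengths between clients; an assumed input, not from the paper.
    For a symmetric adjacency this equals trace(W^T L W), where L is the
    graph Laplacian, which is where the regularizer gets its name.
    """
    total = 0.0
    for k, w_k in enumerate(weights):
        for l, w_l in enumerate(weights):
            sq_dist = sum((a - b) ** 2 for a, b in zip(w_k, w_l))
            total += adjacency[k][l] * sq_dist
    return 0.5 * total


# Two hypothetical clients connected with relationship strength 1.
client_weights = [[0.0, 0.0], [3.0, 4.0]]
client_graph = [[0.0, 1.0], [1.0, 0.0]]
penalty = laplacian_penalty(client_weights, client_graph)
```

Adding a scaled version of this penalty to the sum of the clients' local losses encourages related clients toward similar models while still letting each keep its own weights. With an all-zero adjacency the clients train independently.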
Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the United States. These factors have in turn led to increases in the frequency, extent, and severity of wildfires in recent years. Given the danger posed by wildland fires to people, property, wildlife, and the environment, there is an urgency to provide tools for effective wildfire management. Early detection of wildfires is essential to minimizing potentially catastrophic destruction. In this paper, we present our work on integrating multiple data sources in SmokeyNet, a deep learning model using spatio-temporal information to detect smoke from wildland fires. Camera image data is integrated with weather sensor measurements and processed by SmokeyNet to create a multimodal wildland fire smoke detection system. We present our results comparing performance in terms of both accuracy and time-to-detection for multimodal data vs. a single data source. With a time-to-detection of only a few minutes, SmokeyNet can serve as an automated early notification system, providing a useful tool in the fight against destructive wildfires.
In the era of the Internet of Things (IoT), network-wide anomaly detection is a crucial part of monitoring IoT networks due to the inherent security vulnerabilities of most IoT devices. Principal Components Analysis (PCA) has been proposed to separate network traffic into two disjoint subspaces corresponding to normal and malicious behaviors for anomaly detection. However, privacy concerns and the limited computing resources of devices compromise the practical effectiveness of PCA. We propose a federated PCA-based Grassmannian optimization framework that coordinates IoT devices to aggregate a joint profile of normal network behaviors for anomaly detection. First, we introduce a privacy-preserving federated PCA framework to simultaneously capture the profile of various IoT devices' traffic. Then, we investigate the alternating direction method of multipliers gradient-based learning on the Grassmann manifold to guarantee fast training and the absence of detection latency using limited computational resources. Empirical results on the NSL-KDD dataset demonstrate that our method outperforms baseline approaches. Finally, we show that the Grassmann manifold algorithm is well suited to IoT anomaly detection and drastically reduces the analysis time of the system. To the best of our knowledge, this is the first federated PCA algorithm for anomaly detection meeting the requirements of IoT networks.
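The subspace idea behind PCA-based anomaly detection described above can be sketched in a few lines. This is a plain, centralized PCA computed with NumPy's SVD, not the federated Grassmannian method the abstract proposes; the function names, the toy traffic features, and the choice of one principal component are assumptions made for illustration.

```python
import numpy as np


def fit_normal_subspace(normal_traffic, n_components):
    """Estimate the mean and an orthonormal basis of the 'normal' subspace."""
    normal_traffic = np.asarray(normal_traffic, dtype=float)
    mean = normal_traffic.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(normal_traffic - mean, full_matrices=False)
    return mean, vt[:n_components].T


def anomaly_score(sample, mean, basis):
    """Squared norm of the part of the sample lying outside the normal subspace."""
    centered = np.asarray(sample, dtype=float) - mean
    residual = centered - basis @ (basis.T @ centered)
    return float(residual @ residual)


# Hypothetical 3-feature traffic records that vary along a single direction.
normal = [[t, t, 0.0] for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
mean, basis = fit_normal_subspace(normal, n_components=1)
score_normal = anomaly_score([3.0, 3.0, 0.0], mean, basis)
score_attack = anomaly_score([0.0, 0.0, 5.0], mean, basis)
```

A record would be flagged as anomalous when its score exceeds a threshold chosen from scores observed on normal traffic. In a federated variant, the devices would cooperate to estimate the shared basis without exchanging raw traffic.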